Update Dashboard addon to version 1.8.0 and align /ui redirect with it #53046
Conversation
/assign @lavalamp
@bryk - can you take a look?
We are pretty sure that the failed tests are related to issue #53382.
Force-pushed from e9a1a36 to 8794c1c
Force-pushed from db88c31 to dc866d8
@roberthbailey @floreks and I have managed to fix the failing tests. Can you take a look?
The issue mentioned earlier was fixed by a change in our init container.
@@ -31,12 +36,26 @@ spec:
          memory: 100Mi
        ports:
        - containerPort: 9090
why didn't this port change if everything is shifting to 8443?
It should be changed. This option should expose port 9090 of this container, right? Is this overridden by the EXPOSE option in the Dockerfile? This container will actually only expose port 8443.
Missed it before. Fixed now.
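For reference, a minimal sketch of what the corrected container spec presumably looks like after the fix. Only the shift to port 8443 is stated in the thread; the `protocol` field and comment are assumptions:

```yaml
ports:
- containerPort: 8443  # the 1.8.0 image serves HTTPS on 8443 only
  protocol: TCP
```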
Please squash your commits.
/assign @mikedanese to look at the RBAC changes.
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create", "watch"]
what is the watch used for? cc @kubernetes/sig-auth-pr-reviews
Not something I'd recommend allowing; this grants exposure equivalent to listing all secrets in the namespace. If the dashboard were in its own namespace this would still not be ideal, but it could be more palatable. In kube-system, it's not a reasonable default policy.
I'd love to restrict it even further, but there is no option to define a rule that watches changes on a single resource. The Dashboard is actually only watching a single Dashboard-exclusive resource (a secret named kubernetes-dashboard-key-holder): https://github.com/kubernetes/dashboard/blob/master/src/app/backend/sync/secret.go#L169
Since we are not exposing any endpoint that could allow exploiting this, the only way to abuse that permission would be stealing the token from inside the pod.
PS. It is still a huge step forward for us from the full cluster-admin permissions that were granted previously. We'll update the rules if at some point it becomes possible to restrict a watch to a single resource.
Until individual watch authz is available, you can do individual gets of the secret or mount it into the dashboard pod and react to changes in the mounted content
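A hedged sketch of what the narrower policy could look like under that suggestion. The Role name here is hypothetical, and this relies on the fact that RBAC `resourceNames` can constrain `get` (the object name is in the request URL) but cannot constrain `create` or a collection `watch`, which is why `watch` is dropped in favor of polling `get`s:

```yaml
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubernetes-dashboard-minimal  # hypothetical name
  namespace: kube-system
rules:
  # create cannot be restricted by resourceNames (the name is not
  # known at authorization time), so it stays unqualified.
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create"]
  # get can be limited to the single secret the Dashboard reads.
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["kubernetes-dashboard-key-holder"]
  verbs: ["get"]
```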
Agree with liggitt. Don't use watch, just poll with gets. This is what kubelet does.
We could use multiple decryption keys, but how does this solve the issue of syncing them across all replicas?
Consider the use case where the Dashboard is behind a load balancer and there is more than one replica. Say a user has logged in and the token was encrypted with the locally synchronized key of backend-1. The user then gets redirected to backend-2 without knowing it, and backend-2 might not have that key synchronized yet if we use a polling mechanism with e.g. a 5-minute period. As a result, the user is forcibly logged out.
Second use case: we have one replica and the key is synchronized with a secret. The secret gets deleted manually. In the meantime, the Dashboard is scaled up to two replicas. The second replica cannot find the secret, so it generates a new encryption key and stores it in a secret. Because of polling, we now have two replicas with different keys that will be out of sync for a few minutes.
Currently, when the secret gets deleted it is immediately recreated from the local copy stored in one of the replicas.
> Then user gets redirected to backend-2 without knowing that and it might not have this key synchronized yet if we use polling mechanism with i.e. 5 min period. In result user gets forced logged out.
- secret contains [key1], all replicas use key1 for encrypting and decrypting
- update secret to contain [key1, key2]. as replicas observe the new secret, they use key1 for encrypting and attempt decrypting with key1 and key2
- wait at least as long as your secret distribution period, then update the secret to contain [key2, key1]. as replicas observe the new secret, they use key2 for encrypting and attempt decrypting with key2 and key1
- wait as long as your cookie expiration period (so cookies created using key1 would no longer be valid), then update the secret to contain [key2]
The wait at step 2 is required to let all replicas observe the new decryption key before any of them starts encrypting with it. Alternatively, replicas could react to decryption failures by re-polling the secret to see if there is a new key to use.
The wait at step 3 is required to avoid logging out users who logged in with a session that can only be decrypted by the previous key.
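The encrypt-with-current, decrypt-with-any-listed-key scheme above can be sketched as follows. This is an illustration of the rotation idea, not the Dashboard's actual implementation; AES-GCM is chosen here only because a failed `Open` cleanly signals "wrong key":

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// seal encrypts plaintext under one key, prepending the random nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal; it fails if the key does not match.
func open(key, ciphertext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(ciphertext) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := ciphertext[:gcm.NonceSize()], ciphertext[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

// Encrypt always uses the first (current) key in the secret's list.
func Encrypt(keys [][]byte, pt []byte) ([]byte, error) {
	return seal(keys[0], pt)
}

// Decrypt tries every listed key, so cookies sealed under an older key
// keep working during the rotation overlap window described above.
func Decrypt(keys [][]byte, ct []byte) ([]byte, error) {
	for _, k := range keys {
		if pt, err := open(k, ct); err == nil {
			return pt, nil
		}
	}
	return nil, errors.New("no key could decrypt the payload")
}

func main() {
	key1 := make([]byte, 32) // demo keys; real ones would be random
	key2 := make([]byte, 32)
	key2[0] = 1
	ct, _ := Encrypt([][]byte{key1}, []byte("session-token"))
	// After rotating the secret to [key2, key1], old cookies still decrypt.
	pt, err := Decrypt([][]byte{key2, key1}, ct)
	fmt.Println(string(pt), err) // prints "session-token <nil>"
}
```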
Right now we do not have a key rotation mechanism implemented; however, there is a fallback mechanism that forces a synchronous update of the secret in case decryption fails.
Still, with the current implementation I think polling would not work, and it would have to be extended to support storing multiple keys in a secret (even that would need some rework to work properly).
A case in which this would not work with polling:
- Start with 1 replica; it generates and creates a secret with key-1.
- The secret gets deleted.
- Scale replicas to 2. The new replica creates a secret with a new key-2. It has no information about the old key-1.
- A request goes to the 2nd replica. A token encrypted with key-1 cannot be decrypted with the new key-2. The user is logged out.
If points 2-4 happen within the polling interval and the new replica cannot synchronize both keys, there is a problem. Currently this is very unlikely to happen because, thanks to the watch, the secret gets immediately recreated from a local copy.
> Case in which this would not work with polling is:
> - Start with 1 replica, it generates and creates a secret with key-1.
> - Secret gets deleted.
> - Scale replicas to 2. New replica creates secret with a new key-2. It does not have information about old key-1.
> - Request goes to 2nd replica. Token encrypted with key-1 can not be decrypted with new key-2. User is logged out.
Yes, deleting state disrupts rolling update. The same thing would happen with watch (unless you had replica-1 repopulate the secret with potentially old keys, which I wouldn't expect if the secret is supposed to be the authoritative shared state).
> Should we implement the behaviour described by @liggitt?

Moving to polling seems reasonable for such a slow-moving object, especially given the security tradeoff of granting complete access to all kube-system secrets. You could even do a rate-limited re-poll when a decode error is encountered, to stay responsive to key changes on demand.
> move it to another namespace (can Dashboard be cluster-service then?)

That would be ideal, but I think the add-on manager only targets the kube-system namespace today.

> do @floreks' concerns sound reasonable, or is there another way to go?

In order for existing user sessions to continue working, and to preserve the ability to scale replicas up and down, you have to keep old decryption keys available in shared state (in the secret) for as long as your user sessions last.
Force-pushed from dabcc19 to 53cecab
/retest
/lgtm
/approve no-issue
This has been approved for an extension until the end of Dec 1.
[MILESTONENOTIFIER] Milestone Pull Request: Current
@bryk @lavalamp @liggitt @maciaszczykm @mikedanese @roberthbailey @zmerlynn
@floreks @maciaszczykm Can you work on getting approvals from
/assign @lavalamp
Could you take a look?
ACK. Needs OWNERS approval |
The apimachinery is no worse than it was before. Thanks for noting it will be removed in a later release.
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: deads2k, liggitt, maciaszczykm, roberthbailey. Associated issue requirement bypassed by: roberthbailey. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS files:
You can indicate your approval by writing
/test all [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.
…#53046-upstream-release-1.8 Automated cherry pick of #53046
What this PR does / why we need it: In Dashboard 1.8.0 we have introduced a couple of changes (security, settings, new resources etc.) and fixed a lot of bugs. You can check release notes at https://github.com/kubernetes/dashboard/releases/tag/v1.8.0.
Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes #

Special notes for your reviewer:
Release note: